
fix: stream database dumps to support large databases (#59)#129

Open
xyaz1313 wants to merge 1 commit into outerbase:main from xyaz1313:fix/streaming-dump-large-databases

Conversation

@xyaz1313

Problem

Fixes #59

The dump endpoint (dumpDatabaseRoute) loads all table data into memory before sending the response. For large databases (1GB+), this causes:

  1. Out-of-memory crashes — entire DB is materialized in a string
  2. 30-second timeout — Cloudflare Workers kills long-running requests

Solution

Rewrote to use **ReadableStream** with LIMIT/OFFSET batching:

  • Fetches rows in batches of 500 (constant BATCH_SIZE)
  • Streams INSERT statements incrementally via ReadableStream
  • Yields control between batches (10ms breathing interval) to avoid starving other concurrent requests
  • Proper SQL escaping: identifiers with double-quotes, values with correct handling for strings, NULL, blobs, booleans
  • Peak memory is now O(BATCH_SIZE) instead of O(database_size)
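The batching loop above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `streamTableDump` and the `query` callback are hypothetical names, and value serialization is a placeholder (proper escaping is covered under Design Decisions).

```typescript
const BATCH_SIZE = 500;

// Sketch of LIMIT/OFFSET batched streaming. `query` stands in for the
// database driver; only one batch of rows is in memory at a time.
function streamTableDump(
  table: string,
  query: (sql: string) => Promise<Record<string, unknown>[]>
): ReadableStream<string> {
  let offset = 0;
  return new ReadableStream<string>({
    async pull(controller) {
      // Fetch the next batch; peak memory stays O(BATCH_SIZE).
      const rows = await query(
        `SELECT * FROM "${table}" LIMIT ${BATCH_SIZE} OFFSET ${offset}`
      );
      if (rows.length === 0) {
        controller.close();
        return;
      }
      for (const row of rows) {
        // Placeholder serialization; real code escapes each value properly.
        const values = Object.values(row).map(String).join(", ");
        controller.enqueue(`INSERT INTO "${table}" VALUES (${values});\n`);
      }
      offset += rows.length;
      // Breathing interval: yield so concurrent requests aren't starved.
      await new Promise((resolve) => setTimeout(resolve, 10));
    },
  });
}
```

Because `ReadableStream` only calls `pull` when the consumer needs more data, backpressure from a slow client also throttles the database reads.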

Changes

  • — complete rewrite using streaming
  • — updated tests + new tests for NULL handling and multi-batch streaming

Testing

All 25 export tests pass.

Design Decisions

  1. Batch size 500 — small enough to fit comfortably in memory, large enough to minimize query round-trips
  2. 10ms breathing interval — prevents the Durable Object from being locked during long exports
  3. Identifier escaping — wraps table names in double-quotes to handle special characters safely
  4. Streaming via ReadableStream — standard Web API supported by Cloudflare Workers, no extra dependencies
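The escaping rules in points 3 and the Solution section can be sketched as two small helpers. The names `escapeIdentifier` and `escapeValue` are illustrative, not necessarily the PR's actual identifiers; blob-to-hex output assumes SQLite's `X'…'` literal syntax.

```typescript
// Identifiers: wrap in double-quotes, doubling any embedded quote.
function escapeIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Values: NULL as-is, booleans as 0/1, numbers verbatim, blobs as hex
// literals, strings single-quoted with embedded quotes doubled.
function escapeValue(v: unknown): string {
  if (v === null || v === undefined) return "NULL";
  if (typeof v === "boolean") return v ? "1" : "0";
  if (typeof v === "number") return String(v);
  if (v instanceof Uint8Array) {
    const hex = Array.from(v, (b) => b.toString(16).padStart(2, "0")).join("");
    return `X'${hex}'`;
  }
  return `'${String(v).replace(/'/g, "''")}'`;
}
```

Doubling quotes (rather than backslash-escaping) follows the SQL standard, which is why `O'Brien` serializes as `'O''Brien'`.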

Commit Message

Previously, dumpDatabaseRoute loaded all table data into memory before
sending the response. For databases >1GB this caused:
1. Out-of-memory crashes
2. 30-second Cloudflare Workers timeout

This commit rewrites the dump to use ReadableStream with LIMIT/OFFSET
batching:

- Fetches rows in batches of 500 (constant BATCH_SIZE)
- Streams INSERT statements incrementally via ReadableStream
- Yields control between batches (10ms breathing interval) to avoid
  starving other concurrent requests
- Escapes identifiers with double-quotes and values with proper SQL
  quoting (strings, NULL, blobs, booleans)
- Peak memory is now O(BATCH_SIZE) instead of O(database_size)

All existing tests pass, plus new tests for NULL handling and
multi-batch streaming.


Development

Successfully merging this pull request may close these issues.

Database dumps do not work on large databases
